Power Efficiency Enhancements of a Multi-Bit Accelerator for Memory Prohibitive Deep Neural Networks
Authors
Abstract
Convolutional Neural Networks (CNNs) are widely employed in contemporary artificial intelligence systems. However, these models have millions of connections between layers, which makes them both memory prohibitive and computationally expensive. Deploying them in an embedded mobile application is resource limited, with high power consumption and a significant bandwidth requirement to access data from off-chip DRAM. Reducing data movement between the chip and off-chip DRAM is the main criterion for achieving high throughput and better overall energy efficiency. Our proposed multi-bit accelerator achieves these goals by truncating the partial sum (Psum) results of the preceding layer before feeding them into the next layer. We exhibit an architecture that performs inference at 32 bits for the first convolution layers and then sequentially truncates bits at the MSB/LSB of the integer and fractional parts, without any further training of the original network. At the last fully connected layer, the top-1 accuracy is maintained at a reduced bit width of 14, and the top-5 accuracy up to a 10-bit width. The computation engine consists of a systolic array of 1024 processing elements (PEs). Large CNNs such as AlexNet, MobileNet, SqueezeNet, and EfficientNet were used as benchmark CNN models, and a Virtex UltraScale FPGA was used to test the architecture. The scheme has a 49% power reduction and reductions in resource utilization of 73.25% in LUTs (look-up tables), 68.76% in FFs (flip-flops), 74.60% in BRAMs (block RAMs), and 79.425% in DSPs (digital signal processors) when compared with the 32-bit design, with a performance of 223.69 GOPS on the FPGA, a gain of 3.63× over other prior accelerators. In addition, it consumes 4.5× lower power than other architectures. An ASIC version designed in a 22nm FDSOI CMOS process achieves 2.03 TOPS/W at a total power of 791 mW and an area of 1 mm × 1.2 mm.
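The layer-wise Psum truncation described above can be illustrated with a small fixed-point sketch. This is a minimal illustration, assuming a signed Qm.n fixed-point format held in a Python integer; the function name and the saturation policy are assumptions for illustration, not details taken from the paper:

```python
def truncate_psum(psum, int_bits, frac_bits, new_int_bits, new_frac_bits):
    """Truncate a fixed-point partial sum: drop LSBs of the fractional
    part via an arithmetic right shift, then saturate the reduced signed
    range to handle removed MSBs of the integer part (illustrative only)."""
    # Drop LSBs: shift out the removed fraction bits
    shifted = psum >> (frac_bits - new_frac_bits)
    # Saturate to the reduced signed range (new_int_bits + new_frac_bits wide)
    max_val = (1 << (new_int_bits + new_frac_bits - 1)) - 1
    min_val = -(1 << (new_int_bits + new_frac_bits - 1))
    return max(min_val, min(max_val, shifted))


# Example: a 32-bit Q16.16 partial sum reduced to a 14-bit Q7.7 value,
# mirroring the 32-bit-to-14-bit reduction the abstract reports.
narrow = truncate_psum(74565, 16, 16, 7, 7)   # 74565 >> 9 = 145, in range
```

A value that overflows the narrower integer range is clamped to the saturation bound rather than wrapping, which is one common hardware choice; the paper may handle MSB removal differently.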
Similar Resources
Towards a Low Power Hardware Accelerator for Deep Neural Networks
In this project, we take a first step towards building a low-power hardware accelerator for deep learning. We focus on RBM-based pretraining of deep neural networks and show that there is significant robustness to random errors in the pre-training, training, and testing phases of using such neural networks. We propose to leverage such robustness to build accelerators using low-power but possibly un...
Snowflake: A Model Agnostic Accelerator for Deep Convolutional Neural Networks
Deep convolutional neural networks (CNNs) are the deep learning model of choice for performing object detection, classification, semantic segmentation, and natural language processing tasks. CNNs require billions of operations to process a frame. This computational complexity, combined with the inherent parallelism of the convolution operation, makes CNNs an excellent target for custom accelerator...
A Low-Power Accelerator for Deep Neural Networks with Enlarged Near-Zero Sparsity
It remains a challenge to run deep learning in devices with stringent power budgets in the Internet of Things. This paper presents a low-power accelerator for processing deep neural networks in embedded devices. The power reduction is realized by avoiding multiplications of near-zero-valued data. The near-zero approximation and a dedicated Near-Zero Approximation Unit (NZAU) are proposed to ...
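The near-zero skipping idea in the snippet above can be sketched in a few lines. This is a behavioral illustration only, not the paper's NZAU hardware; the threshold value and function name are assumptions:

```python
def sparse_dot(weights, activations, eps=1e-3):
    """Dot product that skips multiplications for near-zero activations,
    approximating them as exact zeros (illustrative of the NZAU idea)."""
    total = 0.0
    for w, a in zip(weights, activations):
        if abs(a) >= eps:          # only spend a multiply on significant data
            total += w * a
    return total


# The 0.0005 activation falls below eps and contributes nothing:
y = sparse_dot([1.0, 2.0, 3.0], [0.0005, 1.0, 2.0])   # ~ 2*1 + 3*2 = 8.0
```

In hardware the saved multiplications translate directly into dynamic power savings; in software the benefit would only appear with vectorized masking, so this loop is purely didactic.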
Improving Power Generation Efficiency using Deep Neural Networks
Recently, there has been significant research on power generation, distribution, and transmission efficiency, especially in the case of renewable resources. The main objective is the reduction of energy losses, and this requires improvements in data acquisition and analysis. In this paper, we address these concerns by using consumers' electrical smart meter readings to estimate network loading, and this ...
GENES IV: A bit-serial processing element for a multi-model neural-network accelerator
A systolic array of dedicated processing elements (PEs) is presented as the heart of a multi-model neural-network accelerator. The instruction set of the PEs allows the implementation of several widely-used neural models, including multi-layer Perceptrons with the backpropagation learning rule and Kohonen feature maps. Each PE holds an element of the synaptic weight matrix. An instantaneous swa...
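The organization described above, in which each PE holds one element of the synaptic weight matrix and partial sums flow through the array, can be sketched as a weight-stationary matrix-vector product. This is a behavioral Python sketch under an assumed dataflow, not a model of the GENES IV instruction set:

```python
def systolic_matvec(W, x):
    """Weight-stationary systolic matrix-vector product: PE (i, j) holds
    weight W[i][j], and a partial sum accumulates as it flows along row i.
    A plain sequential sketch of the dataflow, not cycle-accurate."""
    rows, cols = len(W), len(W[0])
    y = [0] * rows
    for i in range(rows):              # one row of PEs per output neuron
        psum = 0
        for j in range(cols):          # partial sum flows across the row
            psum += W[i][j] * x[j]     # MAC performed by PE (i, j)
        y[i] = psum
    return y


# Two output neurons, two inputs:
y = systolic_matvec([[1, 2], [3, 4]], [5, 6])   # [1*5+2*6, 3*5+4*6]
```

In a real array the row accumulations proceed in parallel with inputs skewed in time; collapsing that schedule into nested loops keeps the arithmetic identical while hiding the pipelining.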
Journal
Journal title: IEEE Open Journal of Circuits and Systems
Year: 2021
ISSN: 2644-1225
DOI: https://doi.org/10.1109/ojcas.2020.3047225